This notebook outlines my process for building tree-based and neural network models. It depends on the data table gameInfo generated by DataExtraction.RMD.
Loading Data from the Data Extraction Notebook
load("../data/league.RDATA")
Packages
library(tidyverse)
library(data.table)
library(randomForest)
library(rpart.plot)
library(word2vec)
library(Rtsne)
library(plotly)
library(keras)
library(tfruns)
library(rsample)
List to Store Results
data.tree <- list(
models = list(),
plots = list(),
temp.data = list()
)
championCluster <- list(
models = list(),
plots = list(),
temp.data = list()
)
Wrangling Data
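I want to build a basic tree classifier that predicts the winning team from the two team compositions. For now, the model uses only simple features: the number of champions with each tag (Assassin, Fighter, Marksman, Tank, Mage, Support) on each team.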
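As a rough illustration of these tag-count features, the sketch below builds them in base R for a toy data set. The data frame, its column names (`match`, `team`, `tag`) and the helper `tag_features` are hypothetical stand-ins, not the notebook's actual wrangling code; the `_1` and `_2` suffixes follow the variable names that appear in the random forest output.

```r
# Tags as they appear in the variable names of the forest output.
tag_levels <- c("Assassin", "Fighter", "Marksman", "Tank", "Mage", "Support")

# Hypothetical toy data: one row per champion tag per team per match.
toy <- data.frame(
  match = c("M1", "M1", "M1", "M1", "M2", "M2", "M2"),
  team  = c(1, 1, 2, 2, 1, 2, 2),
  tag   = c("Mage", "Mage", "Tank", "Fighter", "Support", "Assassin", "Tank"),
  stringsAsFactors = FALSE
)

# Count tags per team for a single match; absent tags are counted as 0
# because both factors carry their full set of levels.
tag_features <- function(df) {
  tab <- table(factor(df$team, levels = 1:2),
               factor(df$tag, levels = tag_levels))
  out <- c(tab["1", ], tab["2", ])
  names(out) <- c(paste0(tag_levels, "_1"), paste0(tag_levels, "_2"))
  out
}

# One row of 12 features per match.
features <- do.call(rbind, lapply(split(toy, toy$match), tag_features))
features
```

Fixing the factor levels up front means every match yields the same 12 columns, even when a tag never occurs on a team.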
Setting up Training / Test Data
# Setting Seed for Reproducibility
set.seed(3)
data.tree$temp.data$sample <- sample(data.tree$temp.data$gameInfo.tree$match, nrow(data.tree$temp.data$gameInfo.tree)*.7)
data.tree$temp.data$train <- data.tree$temp.data$gameInfo.tree %>%
filter(match %in% data.tree$temp.data$sample)
data.tree$temp.data$test <- data.tree$temp.data$gameInfo.tree %>%
filter(!match %in% data.tree$temp.data$sample)
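The split above only behaves as a clean 70/30 partition if every match id appears exactly once in the table. A minimal base R sketch of the same idea, with sanity checks, on a hypothetical toy data frame (`toy`, `train_ids` and the generated ids are made up for illustration):

```r
set.seed(3)

# Hypothetical stand-in for the modelling table: one row per match.
toy <- data.frame(
  match    = sprintf("M%03d", 1:100),
  team_win = factor(sample(1:2, 100, replace = TRUE), levels = c(1, 2))
)

# Sample 70% of the match ids for training; the rest form the test set.
train_ids <- sample(toy$match, size = floor(0.7 * nrow(toy)))
train <- toy[toy$match %in% train_ids, ]
test  <- toy[!toy$match %in% train_ids, ]

# Sanity checks: no match lands in both sets, and no match is dropped.
stopifnot(length(intersect(train$match, test$match)) == 0)
stopifnot(nrow(train) + nrow(test) == nrow(toy))
```

If match ids were duplicated, the `%in%` filter would pull every copy into the training set and the proportions would drift from 70/30, which is what the checks are meant to catch.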
Generating Random Forest
set.seed(3)
data.tree$models$teamComp_forest <- randomForest(
team_win ~ . - match,
data = data.tree$temp.data$train,
ntree = 500,
importance = TRUE,
na.action = na.omit
)
data.tree$models$teamComp_forest
Call:
randomForest(formula = team_win ~ . - match, data = data.tree$temp.data$train, ntree = 500, importance = TRUE, na.action = na.omit)
Type of random forest: classification
Number of trees: 500
No. of variables tried at each split: 3
OOB estimate of error rate: 50.08%
Confusion matrix:
      1     2 class.error
1 11396 12530   0.5236981
2 11377 12438   0.4777241
importance(data.tree$models$teamComp_forest)
                    1          2 MeanDecreaseAccuracy MeanDecreaseGini
Assassin_1  9.9540136  -6.814106            5.1561345         332.2679
Fighter_1  17.2205799 -13.858278            5.5311721         346.1510
Marksman_1  9.5225581  -8.116976            1.7123134         293.4874
Tank_1      7.4799838  -4.724412            3.9559451         285.2230
Mage_1     16.9019553 -14.924233            2.9464378         284.1115
Support_1  12.2424356 -11.952428            1.8623555         242.4686
Assassin_2  0.3371864  -2.088303           -2.3073435         359.3653
Fighter_2   2.4827423  -4.897604           -2.7091299         425.1212
Marksman_2  3.2506393  -4.942703           -1.7623174         369.9343
Tank_2      3.0022087  -5.144555           -2.4517398         338.7178
Mage_2      1.0833946  -1.633843           -0.6288478         389.8527
Support_2   0.6855808  -5.485852           -5.9403592         288.7111
varImpPlot(data.tree$models$teamComp_forest)

Let’s compare this to a simple classifier that always predicts a blue-side win:
data.tree$temp.data$gameInfo.tree %>%
count(team_win) %>%
mutate(n = n/sum(n))
With an OOB error rate of 50.08% (about 49.9% accuracy), the forest is essentially indistinguishable from the naive blue-side-wins classifier, so the number of champions per tag clearly isn’t a strong predictor of team success. With the current feature coding, I’m fairly certain there won’t be a robust classifier.
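As a rough check on this comparison, the arithmetic can be redone from the training-set counts in the confusion matrix printed earlier. This is only a sketch: it uses the training rows rather than the full table, and since the printed output does not say which class is the blue side, the always-predict-one-team accuracy is shown for both classes.

```r
# Counts copied from the printed confusion matrix
# (rows = actual winning team, columns = predicted winning team).
conf <- matrix(
  c(11396, 12530,
    11377, 12438),
  nrow = 2, byrow = TRUE,
  dimnames = list(actual = c("1", "2"), predicted = c("1", "2"))
)

n_total <- sum(conf)

# OOB accuracy is the share of matches on the diagonal.
oob_accuracy <- sum(diag(conf)) / n_total

# A classifier that always predicts one team is right exactly as often
# as that team actually wins, i.e. that team's share of the rows.
always_team_accuracy <- rowSums(conf) / n_total

round(100 * oob_accuracy, 2)
round(100 * always_team_accuracy, 2)
```

All three figures land within about 0.2 percentage points of 50%, which is why the forest and the naive classifier are hard to tell apart on this feature set.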
Let’s try to identify clusters of champion types.
Generating Input Team Sentences
championCluster$temp.data$teams <- gameInfo %>%
select(match, win, championName) %>%
group_by(match, win) %>%
mutate(championNumber = row_number()) %>%
pivot_wider(
names_from = championNumber,
values_from = championName
) %>%
transmute(match = match, win = win, team = str_c(`1`,`2`,`3`,`4`,`5`, sep = " ")) %>%
ungroup() %>%
select(team)
championCluster$temp.data$teams
Generating Model
There are pretty clearly five main clusters of champions, each corresponding to a role. That doesn’t help much in determining team compositions. I could set up a KNN classifier to verify this, but it seems clear-cut to me.
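A lighter check than a full KNN classifier would be to look up each champion's nearest neighbours in the embedding and see whether they share a role. The sketch below does this with cosine similarity in base R. The 2-D embedding values are made up for illustration and the helper `nearest` is hypothetical; in practice the input would be the embedding matrix with champion names as row names.

```r
# Hypothetical toy embedding: rows are champions, columns are dimensions.
emb <- rbind(
  Lux    = c(0.9, 0.1),
  Veigar = c(0.8, 0.2),
  Talon  = c(0.1, 0.9),
  Zed    = c(0.2, 0.8)
)

# Return the k champions most similar to `name` by cosine similarity.
nearest <- function(emb, name, k = 1) {
  unit <- emb / sqrt(rowSums(emb^2))    # scale each row to unit length
  sims <- drop(unit %*% unit[name, ])   # cosine similarity to the query row
  sims <- sims[names(sims) != name]     # exclude the query itself
  names(sort(sims, decreasing = TRUE))[seq_len(k)]
}

nearest(emb, "Lux")
nearest(emb, "Talon")
```

If the clusters really do track roles, the neighbours returned for most champions should play the same role as the query champion.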
Neural Network
Wrangle Data
Running Model - See TeamCompNN.R
Hyperparameter Tuning
view_run("runs/2021-12-20T22-31-29Z")
Warning in utils::untar(source_tarball, exdir = source_tmp_dir, compressed = TRUE, :
argument 'compressed' is ignored for the internal method
Warning in readLines(file.path(source_dir, file)) :
incomplete final line found on 'C:\Users\tzhan\AppData\Local\Temp\RtmpCKtyCd\file1f402f424b2b/source/TeamCompNN.R'
results
loss accuracy
0.6917616 0.5213519
52.14% accuracy: not the best, but not bad considering how much variance there is in League of Legends.
model %>% predict("Camille Talon Veigar Jihn Lux")
[,1]
[1,] 0.5433422
Passing the team in as a sentence of champion names is a very weird way to code a team comp predictor; I’ll try a different method in Part 3.